Improving computational protein design by using structure-derived sequence profile.
Abstract
Designing a protein sequence that will fold into a predefined structure is of both practical and fundamental interest. Many successful computational designs in the last decade resulted from an improved understanding of how hydrophobic and polar interactions between amino-acid side chains stabilize protein tertiary structures. However, the coupling between main-chain backbone structure and local sequence has yet to be fully addressed. Here, we attempt to account for such coupling by using a sequence profile derived from the sequences of five-residue fragments in a fragment library that are structurally matched to the five-residue segments contained in a target structure. We further introduced a term to reduce low-complexity regions in designed sequences. These two terms, together with optimized reference states for amino-acid residues, were implemented in the RosettaDesign program. The new method, called RosettaDesign-SR, increases by 12 percentage points (from 34% to 46%) the fraction of proteins whose designed sequences are more than 35% identical to wild-type sequences. Meanwhile, it reduces by 8 percentage points (from 22% to 14%) the fraction of designed sequences that are not homologous to any known protein sequence according to PSI-BLAST. More importantly, the sequences designed by RosettaDesign-SR have 2-3% more polar residues at the surface and core regions of proteins, and these surface and core polar residues have about 4% higher sequence identity to wild-type sequences than those designed by RosettaDesign. Thus, the proteins designed by RosettaDesign-SR should be less likely to aggregate and more likely to have unique structures due to more specific polar interactions.
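The fragment-derived profile described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation used in RosettaDesign-SR, so everything below is an assumption made for illustration: the function names `build_profile` and `profile_score`, the uniform pseudocount, and the negative log-probability score are hypothetical choices, and the structural matching of library fragments to target segments is assumed to have been done already.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
FRAGMENT_LENGTH = 5  # the abstract specifies five-residue fragments


def build_profile(target_length, matches, pseudocount=1.0):
    """Build a per-position amino-acid profile from matched fragments.

    `matches` is an iterable of (start, fragment_sequence) pairs: the
    sequence of a library fragment that was structurally matched to the
    target segment beginning at index `start`. How the matching is done
    is outside this sketch. The pseudocount is an illustrative assumption.
    """
    counts = [Counter() for _ in range(target_length)]
    for start, fragment in matches:
        if len(fragment) != FRAGMENT_LENGTH:
            raise ValueError("fragments must have exactly five residues")
        if start < 0 or start + FRAGMENT_LENGTH > target_length:
            raise ValueError("fragment does not fit inside the target")
        # Each fragment contributes one observation to five positions.
        for offset, residue in enumerate(fragment):
            counts[start + offset][residue] += 1

    profile = []
    for position_counts in counts:
        total = sum(position_counts[a] for a in AMINO_ACIDS)
        total += pseudocount * len(AMINO_ACIDS)
        profile.append(
            {a: (position_counts[a] + pseudocount) / total for a in AMINO_ACIDS}
        )
    return profile


def profile_score(sequence, profile):
    """Hypothetical profile term: negative log-probability of a sequence."""
    return -sum(math.log(profile[i][res]) for i, res in enumerate(sequence))


# Toy input: three matched fragments covering a six-residue target.
matches = [(0, "ACDEF"), (0, "ACDEF"), (1, "CDEFG")]
profile = build_profile(6, matches)
```

Under these assumptions, a candidate sequence that agrees with the matched fragments receives a lower `profile_score` than one that does not, which is the kind of signal a design energy function could use to favour locally compatible residues.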
Similar resources
Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches
A DNA sequence, which contains all genetic traits, is not a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence later takes on a 3D structure, which is the functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can think of is undertaken by proteins with specific functio...
Direct prediction of profiles of sequences compatible with a protein structure by neural networks with fragment-based local and energy-based nonlocal profiles.
Locating sequences compatible with a protein structural fold is the well-known inverse protein-folding problem. While significant progress has been made, the success rate of protein design remains low. As a result, a library of designed sequences or profile of sequences is currently employed for guiding experimental screening or directed evolution. Sequence profiles can be computationally predi...
Design and Production of Recombinant TAT Protein Structure, Catalytic Domain of Diphtheria Toxin, and Evaluation of Its Effect on Cell Line
Background and Objectives: Cancer is one of the deadliest diseases of the present age, and its conventional therapies have had low success. Toxin therapy of cancer is a new therapeutic approach that has attracted the attention of pharmaceutical specialists. Diphtheria toxin consists of three domains (functional, transducing, and binding), of which the functional part inhibits protein synthesis and...
Computer Aided Molecular Modeling Of Membrane Metalloprotease
Molecular modeling is a set of computational techniques for constructing the 3D structure of a protein, especially of membrane-bound proteins whose structures cannot be elucidated using experimental techniques. These techniques have been applied in the study of membrane metalloproteases for comparing wild and mutated enzymes, docking inhibitors in the catalytic site and examination of binding pocket...
A Mathematical Model for Cell Formation in CMS Using Sequence Data
The cell formation problem in Cellular Manufacturing System (CMS) design has drawn the attention of researchers for more than three decades. However, the use of sequence data for cell formation has been the least investigated area. Sequence data provides valuable information about the flow patterns of various jobs in a manufacturing system. This paper presents a new mathematical model to solve a cell...
Journal title:
- Proteins
Volume 78, Issue 10
Pages -
Publication date: 2010